Semi-Supervised Learning For Sentiment Analysis
Authors
Abstract
We leverage vector space embeddings of sentences and nearest-neighbor methods to transform a small amount of labelled training data into a significantly larger training set using an unlabelled corpus. The quality of the larger training set is measured by prediction accuracy on a benchmark sentiment analysis task. Our results indicate that it is possible to achieve accuracy within 3-5% of the baseline using only 5-8% of the labelled data.
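The abstract does not give the embedding model, the number of neighbours, or any confidence threshold, so the following is only a minimal sketch of the general idea under stated assumptions: sentence vectors are already computed, labels are binary (0 = negative, 1 = positive), similarity is cosine, and an unlabelled sentence is kept only if enough of its nearest labelled neighbours agree. The function name `expand_labels` and the parameters `k` and `min_agreement` are hypothetical and not taken from the paper.

```python
import numpy as np


def expand_labels(labelled_vecs, labels, unlabelled_vecs, k=3, min_agreement=1.0):
    """Pseudo-label unlabelled sentence vectors by a k-nearest-neighbour vote.

    Returns the indices of the unlabelled rows that were kept and the
    pseudo-labels assigned to them. All parameters are illustrative.
    """
    # Normalise rows so that a dot product equals cosine similarity.
    a = labelled_vecs / np.linalg.norm(labelled_vecs, axis=1, keepdims=True)
    b = unlabelled_vecs / np.linalg.norm(unlabelled_vecs, axis=1, keepdims=True)
    sims = b @ a.T  # shape: (n_unlabelled, n_labelled)

    # Indices of the k most similar labelled sentences for each unlabelled one.
    nearest = np.argsort(-sims, axis=1)[:, :k]
    neighbour_labels = labels[nearest]  # shape: (n_unlabelled, k)

    # Majority vote, plus the share of neighbours that agree with the vote.
    positive_share = neighbour_labels.mean(axis=1)
    pseudo = (positive_share >= 0.5).astype(int)
    agreement = np.where(pseudo == 1, positive_share, 1.0 - positive_share)

    # Keep only the confidently labelled sentences.
    keep = agreement >= min_agreement
    return np.flatnonzero(keep), pseudo[keep]


# Toy 2-D "embeddings" standing in for real sentence vectors.
labelled_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
labels = np.array([1, 1, 0, 0])
unlabelled_vecs = np.array([[1.0, 0.05], [0.05, 1.0], [1.0, 1.0]])

kept_idx, pseudo_labels = expand_labels(labelled_vecs, labels, unlabelled_vecs, k=2)
```

In this toy run the third unlabelled vector lies midway between the two classes, so its neighbours disagree and it is left out of the enlarged training set. The kept sentences and their pseudo-labels would then be added to the original labelled data before training the sentiment classifier.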
Similar resources
A Semi-Supervised Framework Based on a Self-Constructed Adaptive Lexicon for Persian Opinion Analysis
With the emergence of Web 2.0 and 3.0, users’ contributions to the WWW have produced a huge volume of valuable opinions. Because manually analyzing such big data is difficult or impossible, sentiment analysis, a branch of natural language processing, has received considerable attention. Unlike for other (popular) languages, only a limited number of research studies have been conducted in ...
Semi-supervised Probabilistic Sentiment Analysis: Merging Labeled Sentences with Unlabeled Reviews to Identify Sentiment
Document level sentiment analysis, the task of determining whether the sentiment expressed in a document is positive or negative, is commonly performed by supervised methods. As with all supervised tasks, obtaining training data for these methods can be expensive and time-consuming. Some semi-supervised approaches have been proposed that rely on sentiment lexicons. We propose a novel supervised ...
Sentiment Analysis by Augmenting Expectation Maximisation with Lexical Knowledge
Sentiment analysis of documents aims to characterise the positive or negative sentiment expressed in documents. It has been formulated as a supervised classification problem, which requires large numbers of labelled documents. Semi-supervised sentiment classification using a limited number of documents or words labelled with sentiment polarities is one approach to reducing labelling cost for effective learn...
More Is Better: Large Scale Partially-supervised Sentiment Classification
We describe a bootstrapping algorithm to learn from partially labeled data, and the results of an empirical study for using it to improve performance of sentiment classification using up to 15 million unlabeled Amazon product reviews. Our experiments cover semi-supervised learning, domain adaptation and weakly supervised learning. In some cases our methods were able to reduce test error by more...
Incremental Learning on Sentiment Analysis Using Weakly Supervised Learning Techniques
Due to the advanced technologies of Web 2.0, people are participating in and exchanging opinions through social media sites such as Web forums and Weblogs. Classification and analysis of such opinions and sentiment information is potentially important for service and product providers as well as users, because this analysis is used for making valuable decisions. Sentiment is expressed differentl...